- Path: keats.ugrad.cs.ubc.ca!not-for-mail
- From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
- Newsgroups: comp.lang.c++,comp.lang.c,comp.os.ms-windows.programmer.misc
- Subject: Re: fastest code
- Date: 15 Apr 1996 09:01:27 -0700
- Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
- Message-ID: <4ktrsnINNfb@keats.ugrad.cs.ubc.ca>
- References: <316112A2.7D37@public.sta.net.cn> <4kof6e$te@news1.mnsinc.com> <4krbhlINNbk4@keats.ugrad.cs.ubc.ca> <3171E8D2.41C67EA6@scn.de>
- NNTP-Posting-Host: keats.ugrad.cs.ubc.ca
-
- In article <3171E8D2.41C67EA6@scn.de>,
- Gerolf Wendland <wendland%hpp015%hpp001.mch2.scn.de@scn.de> wrote:
- >Kazimir Kylheku wrote:
- >It was addition instead of multiplication by constant values (i+i instead of 2*i)
- >
- >> If I were writing a compiler, I wouldn't bother ``optimizing'' this case at
- >> all: If the programmer thinks he knows better, let him have his addition!
- >
- >Let the programmer write down his way of solving the problem.
- >Let the compiler try to do the tedious optimization as much as possible.
-
- In this case, the programmer is clearly trying to second-guess the compiler's
- optimization. Why else would anyone write i+i nowadays? Maybe the programmer
- really _does_ know better. He may be privy to some information about the
- architecture at hand that the compiler writers did not know, and the addition
- may in fact produce faster code.
-
- Remember, the compiler would have to find a faster way to do it than using the
- addition in order for the optimization to be justified. There may be a
- machine-specific peephole optimization for this, but a compiler with a general
- front end which performs machine independent optimizations on an intermediate
- representation might have no reason at all to want to optimize a simple addition.
- To do that, it would have to pick a faster operation. On a machine-independent
- strength reduction scale, a shift and an add are roughly equal. There is no
- reason to choose one over the other.
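-
- As a minimal sketch of the three spellings under discussion (the function
- names are made up for illustration, and unsigned operands are assumed so that
- overflow wraps the same way in each form), the following C shows that they
- compute the same value. Which instruction is actually emitted for each one
- is up to the compiler and the target machine.

```c
#include <assert.h>
#include <limits.h>

/* Three spellings of "double i". Unsigned arithmetic is used so that
   overflow wraps (modulo UINT_MAX + 1) identically in all three. */
unsigned twice_mul(unsigned i)   { return 2u * i; }
unsigned twice_add(unsigned i)   { return i + i; }
unsigned twice_shift(unsigned i) { return i << 1; }

/* Returns 1 if all three forms agree on the given value, 0 otherwise. */
int forms_agree(unsigned i)
{
    return twice_mul(i) == twice_add(i) && twice_add(i) == twice_shift(i);
}

/* Spot-check a few values, including the wrap-around case. */
void check_forms(void)
{
    assert(forms_agree(0u));
    assert(forms_agree(21u));
    assert(forms_agree(UINT_MAX));
}
```

- Since the results are identical, the only thing left for the compiler to
- decide is which form is cheapest on the machine at hand.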
-
- Within an architecture family there may be wide variations in what is fast
- code. I've seen hand-written assembly for an old Sparc model perform poorly
- compared to compiled C, probably because the C was compiled to take
- advantage of the newer model's pipeline setup. For an example, all you need
- to do is run the GNU fcrypt() test on a SuperSparc or UltraSparc, and compare
- the performance of the build with the assembly code linked in against that of
- the pure C program.
-
- Suppose I have an older compiler that cannot take advantage of my latest
- binary-compatible CPU. If it second-guesses what I'm doing, it can produce
- slower code than if it tries to translate some of my idioms more literally.
-
- Maybe the newer CPU has multiple units for doing additions, but only one
- of them does shifting, and the programmer knows that...
-
- --
- I'm not really a jerk, but I play one on Usenet.
-